MentorNet: Regularizing Very Deep Neural Networks on Corrupted Labels
Authors
Abstract
Recent studies have discovered that deep networks are capable of memorizing an entire training set even when the labels are completely random. Since deep models are trained on big data whose labels are often noisy, this ability to overfit noise can lead to poor performance. To overcome overfitting on corrupted training data, we propose a novel technique that regularizes deep networks in the data dimension. This is achieved by learning a neural network, called MentorNet, to supervise the training of the base network, the StudentNet. Our work is inspired by curriculum learning and advances it by learning the curriculum from data with a neural network. We demonstrate the efficacy of MentorNet on several benchmarks. Comprehensive experiments show that it significantly improves the generalization performance of state-of-the-art deep networks trained on corrupted data.
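The abstract leaves the mechanism implicit: MentorNet assigns each training example a weight, and StudentNet minimizes the weighted loss, so examples that look mislabeled contribute less to the gradient. The PyTorch sketch below illustrates that weighting loop under simplifying assumptions; the input features, architecture, and training of MentorNet shown here are placeholders, not the paper's actual design.

import torch
import torch.nn as nn

class MentorNet(nn.Module):
    """Toy stand-in for the paper's MentorNet: maps per-example training
    signals to a weight in [0, 1]. The paper's feature set, architecture,
    and training procedure are richer; only the idea is kept here."""
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.Tanh(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, per_example_loss, epoch_frac):
        # Features: the example's current loss and training progress.
        progress = torch.full_like(per_example_loss, epoch_frac)
        feats = torch.stack([per_example_loss, progress], dim=1)
        return self.net(feats).squeeze(1)  # one weight per example

def weighted_step(student, mentor, optimizer, x, y, epoch_frac):
    """One StudentNet update in which MentorNet down-weights examples
    whose labels look corrupted instead of discarding them outright."""
    per_example_loss = nn.functional.cross_entropy(
        student(x), y, reduction="none")
    with torch.no_grad():  # treat the weights as constants for this step
        w = mentor(per_example_loss, epoch_frac)
    loss = (w * per_example_loss).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Because the weights come from a sigmoid rather than a hard threshold, the curriculum in this sketch is soft: suspect examples are down-weighted, never permanently discarded.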
Similar Articles
How Do Neural Networks Overcome Label Noise?
This work provides an analytical expression for the effect of label noise on the performance of deep neural networks. [Figure 1: Different types of random label noise. (a) Five of MNIST's 10 classes, with clean labels; (b) 20% random noise, 100% network prediction accuracy; (c) 20% randomly spread flip noise, 100% accuracy; (d) 20% locally concentrated noise, 80% accuracy.] DNNs are extremely resistant ...
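For a concrete reference point on the noise settings named in the figure caption, here is a minimal NumPy sketch that injects the simplest of them, uniform random label noise, into an integer label array. The function name and the 20% example are illustrative only; note that a uniform draw can occasionally re-sample the true class, so the effective noise rate is slightly below the nominal fraction.

import numpy as np

def corrupt_labels(labels, noise_frac, num_classes, seed=0):
    """Replace a random fraction of labels with uniformly drawn classes
    (the 'random noise' setting). Flip noise would instead map each class
    to one fixed wrong class; locally concentrated noise would pick the
    corrupted indices from a cluster of similar examples."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    n_corrupt = int(noise_frac * len(labels))
    idx = rng.choice(len(labels), size=n_corrupt, replace=False)
    noisy[idx] = rng.integers(0, num_classes, size=n_corrupt)
    return noisy

# Example: 20% uniform noise over 10 classes, as in panel (b).
clean = np.repeat(np.arange(10), 100)  # 1000 toy labels
noisy = corrupt_labels(clean, noise_frac=0.2, num_classes=10)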
Introducing Deep Modular Neural Networks with a Dual Spatio-Temporal Structure to Improve Continuous Persian Speech Recognition
In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to encode the spatio-temporal information of the frame sequences at their input layer and, simultaneously, of their labels at the output layer. A network trained with this dual spatio-temporal association structure can learn the phonetic sequence...
On the Importance of Single Directions for Generalization
Despite their ability to memorize large datasets, deep neural networks often achieve good generalization performance. However, the differences between the learned solutions of networks which generalize and those which do not remain unclear. Here, we demonstrate that a network’s reliance on single directions in activation space is a good predictor of its generalization performance, across networ...
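Read operationally, "reliance on single directions" can be probed by ablating one unit at a time and measuring how far accuracy falls; networks that generalize well tend to degrade gracefully. The PyTorch sketch below is a simplified stand-in for that kind of analysis: the helper name is hypothetical, and it assumes the hooked layer emits 2-D activations of shape (batch, units).

import torch

@torch.no_grad()
def ablation_accuracy_drop(model, layer, x, y, unit):
    """Zero one unit's activation in `layer` during the forward pass and
    report the accuracy drop on (x, y). Averaged over many units, a large
    drop suggests heavy reliance on single directions."""
    def accuracy():
        return (model(x).argmax(dim=1) == y).float().mean().item()

    baseline = accuracy()

    def zero_unit(module, inputs, output):
        output = output.clone()
        output[:, unit] = 0.0  # ablate one direction in activation space
        return output

    handle = layer.register_forward_hook(zero_unit)
    ablated = accuracy()
    handle.remove()
    return baseline - ablated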
Journal: CoRR
Volume: abs/1712.05055
Publication year: 2017